2026-06-30

Stochastic Differential Equation (SDE)

A Stochastic Differential Equation (SDE) is a differential equation in which one or more terms is a stochastic process, resulting in a solution that is itself a stochastic process. SDEs are used to model systems with random fluctuations and are fundamental to [[Diffusion Model|Diffusion Models]], financial mathematics, and physics.

1. Core Concept

1.1 From ODE to SDE

Ordinary Differential Equation (ODE):

\frac{d x}{d t} = f (t, x)

Stochastic Differential Equation (SDE):

d x_{t} = f (t, x_{t}) d t + g (t, x_{t}) d W_{t}

The key difference is the addition of the stochastic term $g (t, x_{t}) d W_{t}$ , where $W_{t}$ is a [[Wiener Process|Wiener Process]].

[!NOTE] Key Insight
While ODEs describe deterministic evolution, SDEs incorporate random noise, making them suitable for modeling real-world systems with uncertainty.

1.2 Components of SDE

Component	Notation	Role
Drift coefficient	$f (t, x_{t})$ or $μ (t, x_{t})$	Deterministic trend
Diffusion coefficient	$g (t, x_{t})$ or $σ (t, x_{t})$	Noise intensity
[[Wiener Process|Wiener Process]]	$W_{t}$ or $B_{t}$	Source of randomness
State variable	$x_{t}$ or $X_{t}$	System state

2. Mathematical Foundation

2.1 Itô Interpretation

The SDE is interpreted as an integral equation:

x_{t} = x_{0} + \int_{0}^{t} f (s, x_{s}) d s + \int_{0}^{t} g (s, x_{s}) d W_{s}

where the second integral is an [[Itô Integral|Itô integral]].

2.2 Itô’s Lemma

Theorem: For an Itô process:

d x_{t} = μ_{t} d t + σ_{t} d W_{t}

and a twice-differentiable function $h (t, x)$ , we have:

d h (t, x_{t}) = (\frac{\partial h}{\partial t} + μ_{t} \frac{\partial h}{\partial x} + \frac{1}{2} σ_{t}^{2} \frac{\partial^{2} h}{\partial x^{2}}) d t + σ_{t} \frac{\partial h}{\partial x} d W_{t}

[!WARNING] Itô vs Stratonovich
The extra term $\frac{1}{2} σ_{t}^{2} \frac{\partial^{2} h}{\partial x^{2}}$ is unique to Itô calculus and arises from the non-zero quadratic variation of [[Wiener Process|Wiener process]]. This term does not appear in classical calculus.
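The origin of the extra term can be checked numerically. The sketch below (standard library only; the function name `ito_check` and its parameters are illustrative choices, not from the source) simulates one Wiener path and accumulates both the left-endpoint Itô sum of $W \, d W$ and the sum of squared increments. Applying Itô's Lemma to $h (x) = x^{2}$ gives $d (W_{t}^{2}) = d t + 2 W_{t} d W_{t}$, so $W_{T}^{2}$ should equal twice the Itô sum plus a quadratic-variation term close to $T$.

```python
import math
import random

def ito_check(T=1.0, n_steps=100_000, seed=0):
    """Simulate one Wiener path on [0, T]; return (W_T, Ito sum of W dW, sum of dW^2)."""
    rng = random.Random(seed)
    sd = math.sqrt(T / n_steps)   # each increment is N(0, dt)
    w = 0.0
    ito_sum = 0.0                 # left-endpoint (Ito) approximation of int W dW
    quad_var = 0.0                # discrete quadratic variation
    for _ in range(n_steps):
        dw = rng.gauss(0.0, sd)
        ito_sum += w * dw         # integrand evaluated before the increment
        quad_var += dw * dw
        w += dw
    return w, ito_sum, quad_var

w_T, ito_sum, quad_var = ito_check()
print(f"W_T^2             = {w_T**2:.6f}")
print(f"2*ItoSum + QV     = {2 * ito_sum + quad_var:.6f}")
print(f"quadratic variation = {quad_var:.4f} (close to T = 1)")
```

In classical calculus the quadratic-variation term would vanish as the grid is refined; here it converges to $T$, which is exactly the $\frac{1}{2} σ_{t}^{2} \frac{\partial^{2} h}{\partial x^{2}} d t$ contribution.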

2.3 Itô’s Lemma in Multiple Dimensions

For $x_{t} \in R^{n}$ with SDE:

d x_{t} = μ_{t} d t + σ_{t} d W_{t}

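and a twice-differentiable function $h (t, x) : R \times R^{n} \to R$ (for vector-valued $h$, apply the formula to each component):

d h = (\frac{\partial h}{\partial t} + \sum_{i} μ_{t}^{i} \frac{\partial h}{\partial x_{i}} + \frac{1}{2} \sum_{i, j} (σ_{t} σ_{t}^{⊤})_{i j} \frac{\partial^{2} h}{\partial x_{i} \partial x_{j}}) d t + \sum_{i} \frac{\partial h}{\partial x_{i}} σ_{t}^{i} d W_{t}

Here $σ_{t}^{i}$ denotes the $i$-th row of the matrix $σ_{t}$, and $W_{t}$ is a vector of independent Wiener processes.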

3. Common Types of SDEs

3.1 Linear SDE

d x_{t} = (a (t) x_{t} + b (t)) d t + (c (t) x_{t} + d (t)) d W_{t}

Solution method: Use integrating factor or variation of constants.

3.2 Geometric [[Wiener Process|Brownian Motion]]

d S_{t} = μ S_{t} d t + σ S_{t} d W_{t}

Solution (using Itô’s Lemma on $\log S_{t}$ ):

S_{t} = S_{0} \exp ((μ - \frac{σ^{2}}{2}) t + σ W_{t})

Application: Stock price modeling (Black-Scholes).
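Because the solution is in closed form, $S_{T}$ can be sampled directly without time stepping. The sketch below uses only the standard library; the parameter values are arbitrary illustrations. It checks two consequences of the formula: $E [S_{T}] = S_{0} e^{μ T}$, and $E [\log (S_{T} / S_{0})] = (μ - σ^{2} / 2) T$, where the $- σ^{2} / 2$ shift is the Itô correction.

```python
import math
import random

def gbm_terminal(s0, mu, sigma, T, rng):
    """Draw S_T from the closed-form GBM solution, using W_T ~ N(0, T)."""
    w_T = rng.gauss(0.0, math.sqrt(T))
    return s0 * math.exp((mu - 0.5 * sigma**2) * T + sigma * w_T)

rng = random.Random(0)
s0, mu, sigma, T = 100.0, 0.05, 0.2, 1.0   # illustrative values
samples = [gbm_terminal(s0, mu, sigma, T, rng) for _ in range(100_000)]

mean_ST = sum(samples) / len(samples)
mean_log = sum(math.log(s / s0) for s in samples) / len(samples)

print(f"E[S_T] estimate:        {mean_ST:.3f}  vs  S0*exp(mu*T) = {s0 * math.exp(mu * T):.3f}")
print(f"E[log(S_T/S0)] estimate: {mean_log:.4f}  vs  (mu - sigma^2/2)*T = {(mu - 0.5 * sigma**2) * T:.4f}")
```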

3.3 Ornstein-Uhlenbeck Process

d x_{t} = - θ x_{t} d t + σ d W_{t}

Solution:

x_{t} = x_{0} e^{- θ t} + σ \int_{0}^{t} e^{- θ (t - s)} d W_{s}

Properties:

Mean-reverting to zero
Stationary distribution: $x_{\infty} \sim N (0, \frac{σ^{2}}{2 θ})$
Used in interest rate models (Vasicek model)
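The stochastic integral in the solution is Gaussian with variance $\frac{σ^{2}}{2 θ} (1 - e^{- 2 θ Δ t})$ over a step of length $Δ t$, so the process can be simulated exactly on a grid. The following sketch (standard library only; function and variable names are illustrative) uses that transition and compares the empirical long-run variance with $\frac{σ^{2}}{2 θ}$.

```python
import math
import random

def ou_exact_path(x0, theta, sigma, dt, n_steps, rng):
    """Simulate an OU path using the exact Gaussian transition over each step."""
    decay = math.exp(-theta * dt)
    # Std. dev. of the stochastic integral over one step of length dt
    noise_sd = math.sqrt(sigma**2 / (2.0 * theta) * (1.0 - decay**2))
    x = x0
    path = [x0]
    for _ in range(n_steps):
        x = x * decay + noise_sd * rng.gauss(0.0, 1.0)
        path.append(x)
    return path

rng = random.Random(0)
theta, sigma = 2.0, 1.0                    # illustrative values
path = ou_exact_path(5.0, theta, sigma, dt=0.05, n_steps=200_000, rng=rng)

tail = path[1000:]                          # drop the transient from x0 = 5
mean = sum(tail) / len(tail)
emp_var = sum((x - mean) ** 2 for x in tail) / len(tail)
print(f"empirical variance: {emp_var:.4f}  vs  sigma^2/(2*theta) = {sigma**2 / (2 * theta):.4f}")
```

Unlike Euler-Maruyama, this update has no discretization error, so `dt` only controls how finely the path is observed.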

3.4 Variance-Preserving SDE (VP-SDE)

d x_{t} = - \frac{1}{2} β (t) x_{t} d t + \sqrt{β (t)} d W_{t}

Properties:

Marginal distribution: $x_{t} ∣ x_{0} \sim N (e^{- \frac{1}{2} \int_{0}^{t} β (s) d s} x_{0}, (1 - e^{- \int_{0}^{t} β (s) d s}) I)$
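Preserves variance: $Var (x_{t}) = e^{- \int_{0}^{t} β (s) d s} Var (x_{0}) + (1 - e^{- \int_{0}^{t} β (s) d s}) I$, so unit-variance data keeps unit variance for all $t$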
Primary choice for [[Diffusion Model|Diffusion Models]]
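To make the marginal concrete, the sketch below evaluates its mean coefficient and variance for a linear schedule $β (t) = β_{\min} + t (β_{\max} - β_{\min})$ on $t \in [0, 1]$. The schedule and the default values 0.1 and 20 are assumptions for illustration (a commonly used choice), not something fixed by the text above.

```python
import math

def vp_marginal(t, beta_min=0.1, beta_max=20.0):
    """Return (mean coefficient, variance) of x_t | x_0 under a linear beta schedule.

    The linear schedule and its default endpoints are illustrative assumptions.
    """
    # Closed-form integral of beta(s) from 0 to t
    int_beta = beta_min * t + 0.5 * (beta_max - beta_min) * t**2
    mean_coeff = math.exp(-0.5 * int_beta)
    variance = 1.0 - math.exp(-int_beta)
    return mean_coeff, variance

for t in (0.0, 0.25, 0.5, 1.0):
    m, v = vp_marginal(t)
    # m^2 + v == 1: the signal power lost by the mean is replaced by noise
    print(f"t={t:.2f}  mean_coeff={m:.4f}  variance={v:.4f}  m^2+v={m * m + v:.4f}")
```

The identity $m^{2} + v = 1$ is the variance-preserving property: for data with unit variance, the total variance of $x_{t}$ stays at 1 while the signal is traded for noise.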

3.5 Variance-Exploding SDE (VE-SDE)

d x_{t} = \sqrt{\frac{d σ^{2} (t)}{d t}} d W_{t}

Properties:

Variance grows with time: $Var (x_{t} ∣ x_{0}) = σ^{2} (t) - σ^{2} (0)$
No drift term
Alternative formulation for diffusion models

4. Solving SDEs

4.1 Analytical Solutions

Some SDEs have closed-form solutions:

SDE	Solution	Method
$d x_{t} = μ d t + σ d W_{t}$	$x_{t} = x_{0} + μ t + σ W_{t}$	Direct integration
$d S_{t} = μ S_{t} d t + σ S_{t} d W_{t}$	$S_{t} = S_{0} e^{(μ - σ^{2} / 2) t + σ W_{t}}$	Itô’s Lemma
$d x_{t} = - θ x_{t} d t + σ d W_{t}$	$x_{t} = x_{0} e^{- θ t} + σ \int_{0}^{t} e^{- θ (t - s)} d W_{s}$	Integrating factor

4.2 Numerical Methods

Euler-Maruyama Method

Simplest discretization:

x_{t + Δ t} = x_{t} + f (t, x_{t}) Δ t + g (t, x_{t}) \sqrt{Δ t} ϵ, ϵ \sim N (0, 1)

Convergence: Strong order 0.5, weak order 1.0

Milstein Method

Higher-order method:

x_{t + Δ t} = x_{t} + f Δ t + g Δ W + \frac{1}{2} g \frac{\partial g}{\partial x} ((Δ W)^{2} - Δ t)

Convergence: Strong order 1.0
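The difference in strong order can be observed on geometric Brownian motion, where $g = σ S$ and $\frac{\partial g}{\partial x} = σ$, and where the exact solution is available for the same Brownian path. The sketch below (standard library only; names and parameter values are illustrative) drives both schemes with identical increments and reports the mean absolute error at time $T$.

```python
import math
import random

def strong_errors(mu=0.05, sigma=0.5, s0=1.0, T=1.0,
                  n_steps=50, n_paths=2000, seed=0):
    """Mean |S_T(numerical) - S_T(exact)| for Euler-Maruyama and Milstein on GBM."""
    rng = random.Random(seed)
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    err_em = 0.0
    err_mil = 0.0
    for _path in range(n_paths):
        s_em = s_mil = s0
        w = 0.0
        for _step in range(n_steps):
            dw = rng.gauss(0.0, sqrt_dt)
            # Euler-Maruyama update
            s_em += mu * s_em * dt + sigma * s_em * dw
            # Milstein update: adds 0.5 * g * dg/dx * (dW^2 - dt) with g = sigma*s
            s_mil += (mu * s_mil * dt + sigma * s_mil * dw
                      + 0.5 * sigma * s_mil * sigma * (dw * dw - dt))
            w += dw
        exact = s0 * math.exp((mu - 0.5 * sigma**2) * T + sigma * w)
        err_em += abs(s_em - exact)
        err_mil += abs(s_mil - exact)
    return err_em / n_paths, err_mil / n_paths

err_em, err_mil = strong_errors()
print(f"Euler-Maruyama strong error: {err_em:.5f}")
print(f"Milstein strong error:       {err_mil:.5f}")
```

When $g$ does not depend on $x$ (additive noise), the correction term is zero and the two schemes coincide.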

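# Euler-Maruyama method (Python, scalar state)
import math
import random

def euler_maruyama(f, g, x0, T, N):
    """Simulate dx = f(t, x) dt + g(t, x) dW on [0, T] with N uniform steps."""
    dt = T / N
    t = 0.0
    x = x0
    path = [x0]

    for i in range(N):
        # Wiener increment over one step: dW ~ N(0, dt)
        dW = math.sqrt(dt) * random.gauss(0.0, 1.0)
        # Coefficients are evaluated at the start of the step
        x = x + f(t, x) * dt + g(t, x) * dW
        t += dt
        path.append(x)

    return path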

5. Fokker-Planck Equation

5.1 Forward Kolmogorov Equation

The probability density $p (t, x)$ of the SDE solution satisfies:

\frac{\partial p}{\partial t} = - \frac{\partial}{\partial x} [f (t, x) p] + \frac{1}{2} \frac{\partial^{2}}{\partial x^{2}} [g (t, x)^{2} p]

This is the Fokker-Planck equation (or forward Kolmogorov equation).

5.2 Stationary Distribution

For time-homogeneous SDE $d x_{t} = f (x_{t}) d t + g (x_{t}) d W_{t}$ , the stationary distribution $p_{\infty} (x)$ satisfies:

0 = - \frac{d}{d x} [f (x) p_{\infty} (x)] + \frac{1}{2} \frac{d^{2}}{d x^{2}} [g (x)^{2} p_{\infty} (x)]

Solution (for $g (x) = σ$ constant):

p_{\infty} (x) \propto \exp (\frac{2}{σ^{2}} \int^{x} f (y) d y)
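As a worked check, take the Ornstein-Uhlenbeck drift $f (y) = - θ y$ with constant $g = σ$. The formula then reproduces the stationary distribution stated in Section 3.3 (the formula applies whenever the right-hand side is normalizable, which requires $θ > 0$ here):

```latex
\int^{x} f(y)\, dy = -\frac{θ x^{2}}{2}
\quad\Longrightarrow\quad
p_{\infty}(x) \propto \exp\!\left(\frac{2}{σ^{2}} \cdot \left(-\frac{θ x^{2}}{2}\right)\right)
= \exp\!\left(-\frac{x^{2}}{2 \cdot \frac{σ^{2}}{2θ}}\right)
\quad\Longrightarrow\quad
x_{\infty} \sim N\!\left(0, \frac{σ^{2}}{2θ}\right)
```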

5.3 Connection to [[Probability Flow ODE|Probability Flow ODE]]

The Fokker-Planck equation can be written as a continuity equation:

\frac{\partial p}{\partial t} = - \nabla \cdot (v p)

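where, for a state-independent diffusion coefficient $g (t)$, the velocity field is $v (t, x) = f (t, x) - \frac{1}{2} g (t)^{2} \nabla_{x} \log p_{t} (x)$. The ODE $\frac{d x}{d t} = v (t, x)$ is the [[Probability Flow ODE]], which has the same marginals as the SDE.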

6. SDEs in Diffusion Models

6.1 Forward Process

In diffusion models, data $x_{0}$ is gradually corrupted by noise:

d x_{t} = f (t) x_{t} d t + g (t) d W_{t}

Common choices:

SDE Type	$f (t)$	$g (t)$	Marginal
VP-SDE	$- \frac{1}{2} β (t)$	$\sqrt{β (t)}$	$N (e^{- \frac{1}{2} \int β} x_{0}, (1 - e^{- \int β}) I)$
VE-SDE	$0$	$\sqrt{β (t)}$	$N (x_{0}, \int_{0}^{t} β (s) d s \cdot I)$
sub-VP SDE	$- \frac{1}{2} β (t)$	$\sqrt{β (t) (1 - e^{- 2 \int_{0}^{t} β (s) d s})}$	$N (e^{- \frac{1}{2} \int β} x_{0}, (1 - e^{- \int β})^{2} I)$

6.2 Reverse Process

The time-reversal of an SDE (Anderson, 1982):

d x_{t} = [f (t) x_{t} - g (t)^{2} \nabla_{x} \log p_{t} (x)] d t + g (t) d {\bar{W}}_{t}

where:

${\bar{W}}_{t}$ is reverse-time [[Wiener Process|Wiener process]]
$\nabla_{x} \log p_{t} (x)$ is the [[Score Function|score function]]
Must be solved backward from $t = T$ to $t = 0$
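The reverse-time SDE can be tried out in a setting where the score is known exactly. The sketch below assumes one-dimensional Gaussian data $x_{0} \sim N (m, s^{2})$ and a VP-SDE with constant $β$, so that $p_{t}$ is Gaussian and its score has a closed form; all constants and function names are illustrative assumptions. It starts from the prior $N (0, 1)$ at $t = T$ and applies Euler-Maruyama backward in time.

```python
import math
import random
import statistics

BETA = 4.0         # constant beta(t), assumed for illustration
M, S = 2.0, 0.5    # data distribution N(M, S^2), assumed for illustration

def score(x, t):
    """Exact score of p_t when x_0 ~ N(M, S^2) under the constant-beta VP-SDE."""
    a = math.exp(-0.5 * BETA * t)            # mean decay factor
    var = S * S * a * a + 1.0 - a * a        # variance of p_t
    return -(x - M * a) / var

def reverse_sample(rng, T=3.0, n_steps=300):
    """Integrate the reverse-time SDE from t = T down to t = 0."""
    dt = T / n_steps
    x = rng.gauss(0.0, 1.0)                  # prior: p_T is close to N(0, 1)
    t = T
    for _ in range(n_steps):
        # Reverse drift: f(t) x - g(t)^2 * score, with f = -beta/2, g^2 = beta
        drift = -0.5 * BETA * x - BETA * score(x, t)
        # Stepping backward in time flips the sign of the drift increment
        x = x - drift * dt + math.sqrt(BETA * dt) * rng.gauss(0.0, 1.0)
        t -= dt
    return x

rng = random.Random(0)
samples = [reverse_sample(rng) for _ in range(3000)]
print(f"sample mean: {statistics.fmean(samples):.3f} (data mean {M})")
print(f"sample std:  {statistics.pstdev(samples):.3f} (data std {S})")
```

In a diffusion model the closed-form `score` is replaced by a learned network; the stepping logic stays the same.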

6.3 Score Matching

Learn the score function with neural network $s_{θ} (x, t) \approx \nabla_{x} \log p_{t} (x)$ :

Objective:

L (θ) = E_{t, x_{0}, x_{t}} [∥ s_{θ} (x_{t}, t) - \nabla_{x_{t}} \log p (x_{t} ∣ x_{0}) ∥^{2}]

For VP-SDE:

\nabla_{x_{t}} \log p (x_{t} ∣ x_{0}) = - \frac{x_{t} - e^{- \frac{1}{2} \int_{0}^{t} β (s) d s} x_{0}}{1 - e^{- \int_{0}^{t} β (s) d s}}
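A minimal sketch of how this objective might be estimated by Monte Carlo is given below. It assumes scalar data, a linear $β (t)$ schedule, uniform sampling of $t$ on $[t_{\min}, 1]$ (the cutoff avoids the division by a vanishing variance at $t = 0$), and no time-dependent weighting; these choices and all names are illustrative, and `score_fn` stands in for the network $s_{θ}$.

```python
import math
import random

def int_beta(t, beta_min=0.1, beta_max=20.0):
    """Integral of an assumed linear beta schedule from 0 to t."""
    return beta_min * t + 0.5 * (beta_max - beta_min) * t * t

def perturb(x0, t, rng):
    """Sample x_t ~ p(x_t | x_0) and return it with the conditional score target."""
    a = math.exp(-0.5 * int_beta(t))
    var = 1.0 - math.exp(-int_beta(t))
    x_t = a * x0 + math.sqrt(var) * rng.gauss(0.0, 1.0)
    target = -(x_t - a * x0) / var           # VP-SDE conditional score
    return x_t, target

def dsm_loss(score_fn, data, rng, n_samples=10_000, t_min=1e-3):
    """Monte Carlo estimate of the (unweighted) denoising score matching loss."""
    total = 0.0
    for _ in range(n_samples):
        x0 = rng.choice(data)
        t = rng.uniform(t_min, 1.0)
        x_t, target = perturb(x0, t, rng)
        total += (score_fn(x_t, t) - target) ** 2
    return total / n_samples

# Sanity check: for point-mass data the conditional score is the true score,
# so the exact score function attains zero loss.
X_STAR = 1.5
def exact_score(x, t):
    a = math.exp(-0.5 * int_beta(t))
    var = 1.0 - math.exp(-int_beta(t))
    return -(x - a * X_STAR) / var

loss_exact = dsm_loss(exact_score, [X_STAR], random.Random(0))
loss_zero = dsm_loss(lambda x, t: 0.0, [X_STAR], random.Random(0))
print(f"loss of exact score: {loss_exact:.3e}")
print(f"loss of zero model:  {loss_zero:.3e}")
```

For non-degenerate data the minimum of this loss is positive, but its minimizer is still the marginal score $\nabla_{x} \log p_{t} (x)$.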

7. Beyond SDE: ODE Equivalence and Fast Sampling

7.1 From SDE to [[Probability Flow ODE]]

The reverse SDE can be deterministically transformed into an ODE with identical marginals:

d x = [f (t) x - \frac{1}{2} g (t)^{2} \nabla_{x} \log p_{t} (x)] d t

This [[Probability Flow ODE]] enables:

Deterministic sampling (same noise → same output)
Exact likelihood computation via instantaneous change of variables
Inversion of real data to latent space
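The deterministic behavior is easy to see in a case with a closed-form score. The sketch below assumes one-dimensional Gaussian data $N (m, s^{2})$ and a constant-$β$ VP-SDE (all constants and names are illustrative) and integrates the ODE backward with plain Euler steps. For Gaussian marginals the flow preserves the standardized position $(x - \text{mean}_{t}) / \text{std}_{t}$, so a point one standard deviation above the mean at $t = T$ should arrive near $m + s$ at $t = 0$.

```python
import math

BETA = 4.0         # constant beta(t), assumed for illustration
M, S = 2.0, 0.5    # data distribution N(M, S^2), assumed for illustration

def marginal(t):
    """Mean and variance of p_t for x_0 ~ N(M, S^2) under the constant-beta VP-SDE."""
    a = math.exp(-0.5 * BETA * t)
    return M * a, S * S * a * a + 1.0 - a * a

def score(x, t):
    mean, var = marginal(t)
    return -(x - mean) / var

def pf_ode_sample(x_T, T=3.0, n_steps=1000):
    """Euler integration of the probability flow ODE from t = T down to t = 0."""
    dt = T / n_steps
    x, t = x_T, T
    for _ in range(n_steps):
        # dx/dt = f(t) x - 0.5 * g(t)^2 * score, with f = -beta/2, g^2 = beta
        dxdt = -0.5 * BETA * x - 0.5 * BETA * score(x, t)
        x -= dxdt * dt                       # backward step in time
        t -= dt
    return x

mean_T, var_T = marginal(3.0)
x_start = mean_T + math.sqrt(var_T)          # one std above the mean at t = T
x_end = pf_ode_sample(x_start)
print(f"x_T = {x_start:.4f}  ->  x_0 = {x_end:.4f}  (target M + S = {M + S})")
```

Running the same loop forward in time, from $t = 0$ to $t = T$, maps a data point to its latent code, which is the inversion mentioned in the list above.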

7.2 Fast SDE Sampling Methods

Method	Type	Steps	Key Idea
DDPM	SDE	1000	Original Markov chain
[[DDIM]]	Non-Markovian	50-100	Relaxes Markov assumption
[[DPM-Solver]]	ODE	10-20	Semi-linear structure exploitation
Predictor-Corrector	SDE + MCMC	20-50	Langevin refinement steps

7.3 Numerical Stability in SDE Solvers

Key challenges for diffusion model SDEs:

Stiffness near $t = 0$ : VP-SDE becomes very stiff
- Fix: Non-uniform time discretization, more steps near $t = 0$
Score function explosion: As $t \to 0$ , the score $\nabla_{x} \log p_{t} (x)$ diverges
- Fix: Use $ϵ$ -prediction instead of direct score prediction
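Discretization error accumulation: Euler-Maruyama error propagates
- Fix: Higher-order solvers (e.g. stochastic Runge-Kutta schemes, or [[DPM-Solver]] applied to the probability flow ODE). Milstein gives no gain here: $g (t)$ does not depend on $x$, so its correction term vanishes and it reduces to Euler-Maruyama
SDE vs ODE trade-off:
- SDE: Often better sample quality, since freshly injected noise can correct errors from earlier steps, but needs more steps
- ODE: Fewer steps, deterministic, better suited to inversion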

8. Existence and Uniqueness

8.1 Lipschitz Conditions

Theorem: If $f (t, x)$ and $g (t, x)$ satisfy:

Lipschitz condition: $| f (t, x) - f (t, y) | + | g (t, x) - g (t, y) | \leq K | x - y |$
Linear growth: $| f (t, x) |^{2} + | g (t, x) |^{2} \leq K (1 + | x |^{2})$

then the SDE has a unique strong solution.

8.2 Weak vs Strong Solutions

Type	Definition	Requirement
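Strong solution	$x_{t}$ adapted to the filtration generated by $W_{t}$	Probability space and $W_{t}$ are given in advance
Weak solution	A pair $(x_{t}, W_{t})$ on some probability space that satisfies the SDE	Probability space and $W_{t}$ may be chosen; only the law of $x_{t}$ is determined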

9. Girsanov Theorem

9.1 Change of Measure

Theorem: Under suitable conditions, we can change measure $P \to Q$ to eliminate drift:

If $d x_{t} = μ_{t} d t + σ_{t} d W_{t}^{P}$ , then under $Q$ :

d x_{t} = σ_{t} d W_{t}^{Q}

where $d W_{t}^{Q} = d W_{t}^{P} + \frac{μ_{t}}{σ_{t}} d t$ .
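The measure change can be illustrated by reweighting. With $θ_{t} = μ_{t} / σ_{t}$, the standard density is $\frac{d Q}{d P} = \exp (- \int_{0}^{T} θ_{t} d W_{t}^{P} - \frac{1}{2} \int_{0}^{T} θ_{t}^{2} d t)$, and Novikov's condition $E^{P} [\exp (\frac{1}{2} \int_{0}^{T} θ_{t}^{2} d t)] < \infty$ is one common sufficient "suitable condition". The sketch below assumes constant $μ$ and $σ$ and $x_{0} = 0$ (illustrative choices) and checks by Monte Carlo that the weights average to 1 and that the weighted mean of $x_{T}$ is 0, as expected for a driftless process under $Q$.

```python
import math
import random

def girsanov_check(mu=0.5, sigma=1.0, T=1.0, n=200_000, seed=0):
    """Estimate E_P[Z_T] and E_P[Z_T * x_T] for dx = mu dt + sigma dW^P, x_0 = 0."""
    rng = random.Random(seed)
    theta = mu / sigma
    sum_z = 0.0
    sum_zx = 0.0
    for _ in range(n):
        w = rng.gauss(0.0, math.sqrt(T))                  # W_T under P
        x = mu * T + sigma * w                            # x_T under P
        z = math.exp(-theta * w - 0.5 * theta**2 * T)     # dQ/dP at time T
        sum_z += z
        sum_zx += z * x
    return sum_z / n, sum_zx / n

mean_z, mean_zx = girsanov_check()
print(f"E_P[Z_T]       = {mean_z:.4f}  (should be close to 1)")
print(f"E_P[Z_T * x_T] = {mean_zx:.4f}  (E_Q[x_T], should be close to 0)")
```

The same weights $Z_{T}$ are what importance sampling and likelihood-ratio computations use in the applications listed next.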

9.2 Applications

Risk-neutral pricing in finance
Importance sampling for variance reduction
Likelihood ratio computation in diffusion models

10. Core Formula Cards

[!QUOTE] General SDE
$d x_{t} = f (t, x_{t}) d t + g (t, x_{t}) d W_{t}$

[!QUOTE] Itô’s Lemma
$d h = (\frac{\partial h}{\partial t} + f \frac{\partial h}{\partial x} + \frac{1}{2} g^{2} \frac{\partial^{2} h}{\partial x^{2}}) d t + g \frac{\partial h}{\partial x} d W_{t}$

[!QUOTE] Geometric [[Wiener Process|Brownian Motion]]
$d S_{t} = μ S_{t} d t + σ S_{t} d W_{t}$

[!QUOTE] Ornstein-Uhlenbeck Process
$d x_{t} = - θ x_{t} d t + σ d W_{t}$

[!QUOTE] VP-SDE (Diffusion Models)
$d x_{t} = - \frac{1}{2} β (t) x_{t} d t + \sqrt{β (t)} d W_{t}$

[!QUOTE] Reverse-Time SDE
$d x_{t} = [f (t) x_{t} - g (t)^{2} \nabla_{x} \log p_{t} (x)] d t + g (t) d {\bar{W}}_{t}$

[!QUOTE] Fokker-Planck Equation
$\frac{\partial p}{\partial t} = - \frac{\partial}{\partial x} [f p] + \frac{1}{2} \frac{\partial^{2}}{\partial x^{2}} [g^{2} p]$

[!QUOTE] Euler-Maruyama Discretization
$x_{t + Δ t} = x_{t} + f (t, x_{t}) Δ t + g (t, x_{t}) \sqrt{Δ t} ϵ$

Related Concepts

[[Wiener Process|Wiener Process]]
[[Itô Integral]]
[[Itô’s Lemma]]
[[Martingale]]
[[Diffusion Model]]
[[Probability Flow ODE]]
[[Score Function]]
[[DDIM]]
[[DPM-Solver]]
[[Flow Matching]]
[[Markov Process]]
[[Fokker-Planck Equation]]
[[Kolmogorov Equations]]
[[Langevin Dynamics]]
[[Stochastic Process]]
[[Brownian Motion]]

Dataview Query

LIST
FROM #sde OR #stochastic_calculus
SORT file.ctime DESC

References

Book: Stochastic Differential Equations - Bernt Øksendal
Book: Brownian Motion and Stochastic Calculus - Karatzas & Shreve
Paper: Score-Based Generative Modeling through SDEs (Song et al., 2021)
Paper: Maximum Likelihood Training of Score-Based Diffusion Models (Song et al., 2021)
Course: MIT 18.S096 Topics in Mathematics with Applications in Finance
Course: CS236 Deep Generative Models (Stanford)
